37 research outputs found

On the existence of identifiable reparametrizations for linear compartment models

Author: Baaijens Jasmijn A.
Draisma Jan
Publication date: 01/01/2015

The parameters of a linear compartment model are usually estimated from experimental input-output data. A problem arises when infinitely many parameter values can yield the same result; such a model is called unidentifiable. In this case, one can search for an identifiable reparametrization of the model: a map which reduces the number of parameters, such that the reduced model is identifiable. We study a specific class of models which are known to be unidentifiable. Using algebraic geometry and graph theory, we translate a criterion given by Meshkat and Sullivant for the existence of an identifiable scaling reparametrization to a new criterion based on the rank of a weighted adjacency matrix of a certain bipartite graph. This allows us to derive several new constructions to obtain graphs with an identifiable scaling reparametrization. Using these constructions, a large subclass of such graphs is obtained. Finally, we present a procedure of subdividing or deleting edges to ensure that a model has an identifiable scaling reparametrization

arXiv.org e-Print Archive

Repository TU/e

CWI's Institutional Repository

Pure OAI Repository

Bern Open Repository and Information System (BORIS)

De novo approaches to haplotype-aware genome assembly

Author: Baaijens J.A. (Jasmijn)
Publication date: 25/09/2019

CWI's Institutional Repository

Strain-aware assembly of genomes from mixed samples using flow variation graphs

Author: Baaijens J.A. (Jasmijn)
Schönhuth A. (Alexander)
Stougie L. (Leen)
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

The goal of strain-aware genome assembly is to reconstruct all individual haplotypes from a mixed sample at the strain level and to provide abundance estimates for the strains

VU Research Portal

Crossref

CWI's Institutional Repository

INRIA a CCSD electronic archive server

De novo assembly of viral quasispecies using overlap graphs

Author: Aabidine Amal Zine El
Baaijens Jasmijn A.
Rivals Eric
Schönhuth Alexander
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 01/01/2017

Baaijens JA, Aabidine AZE, Rivals E, Schönhuth A. De novo assembly of viral quasispecies using overlap graphs. Genome Research. 2017;27(5):835-848

Publications at Bielefeld University

Full-length de novo viral quasispecies assembly through variation graph construction

Author: Baaijens J.A. (Jasmijn)
Köster J. (Johannes)
Roest B. (Bastiaan) van der
Schönhuth A. (Alexander)
Stougie L. (Leen)
Publication venue: 'Oxford University Press (OUP)'
Publication date: 15/12/2019

CWI's Institutional Repository

Full-length de novo viral quasispecies assembly through variation graph construction

Author: Baaijens Jasmijn
Koester Johannes
Schoenhuth Alexander
Stougie Leen
Van Der Roest Bastiaan
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 19/04/2018

Motivation: Viruses populate their hosts as a viral quasispecies: a collection of genetically related mutant strains. Viral quasispecies assembly refers to reconstructing the strain-specific haplotypes from read data, and predicting their relative abundances within the mix of strains, an important step for various treatment-related reasons. Reference-genome-independent ("de novo") approaches have yielded benefits over reference-guided approaches, because reference-induced biases can become overwhelming when dealing with divergent strains. While being very accurate, extant de novo methods only yield rather short contigs. It remains to reconstruct full-length haplotypes together with their abundances from such contigs. Method: We first construct a variation graph, a recently popular, suitable structure for arranging and integrating several related genomes, from the short input contigs, without making use of a reference genome. To obtain paths through the variation graph that reflect the original haplotypes, we solve a minimization problem that yields a selection of maximal-length paths that is optimal in terms of being compatible with the read coverages computed for the nodes of the variation graph. We output the resulting selection of maximal-length paths as the haplotypes, together with their abundances. Results: Benchmarking experiments on challenging simulated data sets show significant improvements in assembly contiguity compared to the input contigs, while preserving low error rates. As a consequence, our method outperforms all state-of-the-art viral quasispecies assemblers that aim at the construction of full-length haplotypes, in terms of various relevant assembly measures. Our tool, Virus-VG, is publicly available at https://bitbucket.org/jbaaijens/virus-vg

INRIA a CCSD electronic archive server

Lineage Abundance Estimation for SARS-CoV-2 in Wastewater Using Transcriptome Quantification Techniques

Effectively monitoring the spread of SARS-CoV-2 mutants is essential to efforts to counter the ongoing pandemic. Predicting lineage abundance from wastewater, however, is technically challenging. We show that by sequencing SARS-CoV-2 RNA in wastewater and applying algorithms initially used for transcriptome quantification, we can estimate lineage abundance in wastewater samples. We find high variability in signal among individual samples, but the overall trends match those observed from sequencing clinical samples. Thus, while clinical sequencing remains a more sensitive technique for population surveillance, wastewater sequencing can be used to monitor trends in mutant prevalence in situations where clinical sequencing is unavailable

PubMed Central

University of Nebraska Medical Center Research: DigitalCommons@UNMC

Computational pan-genomics: status, promises and challenges

Author: Abeel Thomas
Alkan Can
Baaijens Jasmijn
Bakker Paul
Boeva Valentina
Bonnal Raoul
Chiaromonte Francesca
Chikhi Rayan
Ciccarelli Francesca
Cijvat Robin
Datema Erwin
Dijkstra Louis
Duijn Cornelia
Dutilh Bas
Eichler Evan
El-Kebir Mohammed
Ernst Corinna
Eskin Eleazar
Garrison Erik
Ghaffaari Ali
Guryev Victor
Kersey Paul
Klau Gunnar
Kloosterman Wigard
Korbel Jan
Lameijer Eric-Wubbo
Langmead Benjamin
Marschall Tobias
Martin Marcel
Marz Manja
Medvedev Paul
Mu John
Mäkinen Veli
Neerincx Pieter
Novak Adam
Ouwens Klaasjan
Paten Benedict
Peterlongo Pierre
Pisanti Nadia
Porubsky David
Rahmann Sven
Raphael Benjamin
Reinert Knut
Ridder Dick
Ridder Jeroen
Rivals Eric
Sanders Ashley
Schlesner Matthias
Schulz-Trieglaff Ole
Schönhuth Alexander
Sheikhizadeh Siavash
Shneider Carl
Smit Sandra
The Computational Pan-Genomics Consortium
Valenzuela Daniel
Vandin Fabio
Wang Jiayin
Wessels Lodewyk
Ye Kai
Zhang Ying
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2018

Many disciplines, from human genetics and oncology to plant breeding, microbiology and virology, commonly face the challenge of analyzing rapidly increasing numbers of genomes. In case of Homo sapiens, the number of sequenced genomes will approach hundreds of thousands in the next few years. Simply scaling up established bioinformatics pipelines will not be sufficient for leveraging the full potential of such rich genomic data sets. Instead, novel, qualitatively different computational methods and paradigms are needed. We will witness the rapid extension of computational pan-genomics, a new sub-area of research in computational biology. In this article, we generalize existing definitions and understand a pan-genome as any collection of genomic sequences to be analyzed jointly or to be used as a reference. We examine already available approaches to construct and use pan-genomes, discuss the potential benefits of future technologies and methodologies and review open challenges from the vantage point of the above-mentioned biological disciplines. As a prominent example for a computational paradigm shift, we particularly highlight the transition from the representation of reference genomes as strings to representations as graphs. We outline how this and other challenges from different application domains translate into common computational problems, point out relevant bioinformatics techniques and identify open problems in computer science. With this review, we aim to increase awareness that a joint approach to computational pan-genomics can help address many of the problems currently faced in various domains

INRIA a CCSD electronic archive server

Archivio della Ricerca - Università di Pisa

EUR Research Repository

HAL-MINES ParisTech

Archivio della ricerca della Scuola Superiore Sant'Anna

Radboud Repository

HAL-Rennes 1

A high-quality human reference panel reveals the complexity and distribution of genomic structural variants

Author: Abdellaoui A. (Abdel)
Amin N. (Najaf)
Baaijens J.A. (Jasmijn)
Bakker P.I.W. (Paul) de
Beekman M. (Marian)
Boomsma D.I. (Dorret)
Bot J. (Jan)
Bovenberg J.A. (Jasper)
Byelas G. (George)
Cao H. (Hongzhi)
Cao J.S. (Jeremy Sujie)
Cao R. (Rui)
Chen R. (Ruoyan)
Coe B.P. (Bradley)
Craen A.J.M. (Anton) de
Deelen P. (Patrick)
Dijk F. (Freerk) van
Dijkstra L.J. (Louis)
Dijkstra M. (Martijn)
Du Y. (Yuanping)
Duijn C.M. (Cornelia) van
Dunnen J.T. (Johan) den
Eichler E.E. (Evan)
Enckevort D. (David) van
Estrada K. (Karol)
Francioli L.C. (Laurent)
Guryev V. (Victor)
Handsaker R.E. (Robert)
Hehir-Kwa J.Y. (Jayne)
Hofman A. (Albert)
Hormozdiari F. (Fereydoun)
Hottenga J.-J. (Jouke-Jan)
Kanterakis A. (Alexandros)
Karssen L.C. (Lennart)
Kattenberg V.M. (Mathijs)
Kloosterman W.P. (Wigard)
Knijff P. (Peter) de
Ko A. (Arthur)
Koval V. (Vyacheslav)
Lameijer E.-W. (Eric-Wubbo)
Laros J.F.J. (Jeroen)
Ligt J. (Joep) de
Marschall T. (Tobias)
McCarroll S.A. (Steven)
Mei H. (Hailiang)
Neerincx P.B.T. (Pieter)
Nijman I.J. (Isaac)
Ommen G.-J.B. (Gert-Jan) van
Platteel M. (Mathieu)
Renkens I. (Ivo)
Rivadeneira F. (Fernando)
Santcroos M. (Mark)
Schaik B.D.C. (Barbera) van
Schönhuth A. (Alexander)
Slagboom P.E. (Eline)
Sudmant P. (Peter)
Sun Y. (Yushen)
Swertz M.A. (Morris)
Thung D.T. (Djie Tjwan)
Uitterlinden A.G. (André)
van Leeuwen E.M. (Elisa)
Vermaat M. (Martijn)
Wardenaar R. (René)
Wijmenga C. (Cisca)
Willemsen G. (Gonneke)
Wolffenbuttel B. (Bruce)
Ye K. (Kai)
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 06/10/2016

Structural variation (SV) represents a major source of differences between individual human genomes and has been linked to disease phenotypes. However, the majority of studies neither provide a global view of the full spectrum of these variants nor integrate them into reference panels of genetic variation. Here, we analyse whole genome sequencing data of 769 individuals from 250 Dutch families, and provide a haplotype-resolved map of 1.9 million genome variants across 9 different variant classes, including novel forms of complex indels, and retrotransposition-mediated insertions of mobile elements and processed RNAs. A large proportion are previously underreported variants sized between 21 and 100 bp. We detect 4 megabases of novel sequence, encoding 11 new transcripts. Finally, we show 191 known, trait-associated SNPs to be in strong linkage disequilibrium with SVs and demonstrate that our panel facilitates accurate imputation of SVs in unrelated individuals

CWI's Institutional Repository